A novel k-word relative measure for sequence comparison

Authors

  • Jie Tang
  • Keru Hua
  • Mengye Chen
  • Ruiming Zhang
  • Xiaoli Xie
Abstract

To extract phylogenetic information from DNA sequences, this paper proposes a new normalized k-word average relative distance. The proposed measure was tested by discriminant analysis and phylogenetic analysis. Phylogenetic trees based on the Manhattan distance measure were reconstructed with k ranging from 1 to 12. In addition, a new method is suggested for reducing the matrix dimension, which can greatly reduce the amount of computation and the running time. The experimental assessment demonstrated that our measure is efficient. Moreover, comparison with the results of other methods shows that our method is feasible and powerful for phylogenetic analysis.
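As a rough illustration of the general idea, the sketch below builds k-word (k-mer) relative-frequency vectors for two DNA sequences and compares them with the Manhattan distance. It is not the paper's normalized k-word average relative distance, whose definition is not given in the abstract; the function names `kword_frequencies`, `manhattan_distance`, and `kword_manhattan` are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import product


def kword_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequencies of all len(alphabet)**k k-words, in lexicographic order.

    Illustrative only: a plain frequency vector, not the paper's measure.
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    # Windows containing symbols outside the alphabet are ignored.
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def manhattan_distance(u, v):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def kword_manhattan(seq_a, seq_b, k):
    """Manhattan distance between the k-word frequency vectors of two sequences."""
    return manhattan_distance(kword_frequencies(seq_a, k),
                              kword_frequencies(seq_b, k))
```

A dense vector over the DNA alphabet has 4^k entries (16,777,216 for k = 12), so this naive version becomes expensive for large k, which is presumably why the abstract mentions a dimension-reduction step.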


Similar resources

Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison

MOTIVATION: Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, the Markov model, or both. Motivated by adding k-word distributions to the Markov model directly, we investigat...


Global Sequence Homology Detection Using Word Conservation Probability

Protein homology detection is an important issue in comparative genomics. Because of the exponential growth of sequence databases, fast and efficient homology detection tools are urgently needed. Currently, for homology detection, sequence comparison methods using local alignment such as BLAST are generally used as they give a reasonable measure for sequence similarity. However, these methods h...


A Novel Method for Detection of Epilepsy in Short and Noisy EEG Signals Using Ordinal Pattern Analysis

Introduction: In this paper, a novel complexity measure is proposed to detect dynamical changes in nonlinear systems using ordinal pattern analysis of time series data taken from the system. Epilepsy is considered a dynamical change in the nonlinear and complex brain system. The ability of the proposed measure for characterizing the normal and epileptic EEG signals when the signal is short or is...


A Comparative Study of an Unsupervised Word Sense Disambiguation Approach

Word sense disambiguation is the problem of selecting a sense for a word from a set of predefined possibilities. This is a significant problem in the biomedical domain where a single word may be used to describe a gene, protein, or abbreviation. In this paper, we evaluate SENSATIONAL, a novel unsupervised WSD technique, in comparison with two popular learning algorithms: support vector machines...


A Quantitative Assessment of SENSATIONAL

Word sense disambiguation is the problem of selecting a sense for a word from a set of predefined possibilities. This is a significant problem in the biomedical domain where a single word may be used to describe a gene, protein, or abbreviation. In this paper, we evaluate SENSATIONAL, a novel unsupervised WSD technique, in comparison with two popular learning algorithms, support vector machines...



Journal title:
  • Computational biology and chemistry

Volume 53PB

Publication date: 2014